Mutation screening of the CYP1B1 gene reveals thirteen novel disease-causing variants in consanguineous Pakistani families causing primary congenital glaucoma

Background Primary congenital glaucoma (PCG) is a rare, heterogeneous, recessively inherited disorder that is prevalent in regions with high consanguinity. The disease phenotype is associated with increased intraocular pressure and is a major cause of childhood blindness. Sequence variations in the cytochrome P450 1B1 (CYP1B1) gene are a major cause of PCG. The current study was conducted to screen the CYP1B1 gene in highly consanguineous PCG-affected families from the Pakistani population, consistent with the autosomal recessive pattern of PCG inheritance. Methods For this study, patients and controls (clinically unaffected individuals of each family) from 25 consanguineous families belonging to Punjab, Baluchistan and Khyber Pakhtunkhwa, Pakistan, were recruited through ophthalmologists. DNA was isolated from collected blood samples. Genetic screening of the CYP1B1 gene was done for all enrolled families. In-silico analysis was performed to identify and predict the potential disease-causing variations. Results Pathogenicity screening revealed sequence variants segregating with the disease phenotype in homozygous or compound heterozygous form in eleven of the 25 analyzed families. We identified a total of sixteen disease-causing variants, of which thirteen are previously unreported: five frameshift variants, i.e., c.629dup (p.Gly211Argfs*13), c.287dup (p.Leu97Alafs*127), c.662dup (p.Arg222Profs*2), c.758_759insA (p.Val254Glyfs*73) and c.789dup (p.Leu264Alafs*63); two silent variants, c.1314G>A and c.771T>G; and six missense variants, c.457C>G (p.Arg153Gly), c.516C>A (p.Ser172Arg), c.722T>A (p.Val241Glu), c.740T>A (p.Leu247Gln), c.1263T>A (p.Phe421Leu) and c.724G>C (p.Asp242His). Two frameshift variants, c.868dup (p.Arg290Profs*37) and c.247del (p.Asp83Thrfs*12), and one missense variant, c.732G>A (p.Met244Ile), have been reported previously. Furthermore, six polymorphisms, c.1347T>C, c.2244_2245insT, c.355G>T, c.1294G>C, c.1358A>G and c.142C>G, were also identified.
In the intronic region, a novel silent polymorphism, g.35710_35711insT, was found in the homozygous state. All the newly detected disease-causing variants were absent in 96 ethnically matched controls. Conclusion Among the twenty-five screened families, eight families (PCG50, 52–54, 58, 59, 63 and 67) were segregating disease-causing variants in a recessive manner. Two families (PCG049 and PCG062) showed compound heterozygosity. Our data confirm the genetic heterogeneity of PCG in the Pakistani population; however, we did not find molecular variants segregating with PCG in fifteen families in the coding exons and intron-exon boundaries of the CYP1B1 gene. Genetic counseling was provided to families to refrain from practicing consanguinity and to perform premarital screening as a PCG control measure in upcoming generations.

The CYP1B1 gene has three exons, of which the last two code for a 543 amino acid (a.a) protein [15]. Cytochrome P450 1B1 is a heme-thiolate monooxygenase that oxidizes multiple compounds including xenobiotics, steroids, retinoic acid and melatonin [5,15]. This membrane-bound protein has a transmembrane domain at the amino terminal (53 a.a) and a highly conserved cytoplasmic region (480 a.a) that is connected to the amino terminal by a proline-rich hinge (10 a.a) [16]. The exact function of CYP1B1 in development of the eye is uncertain; however, it is believed that mutations in this gene affect the generation of some important morphogens, leading to structural defects in the trabecular meshwork and the aqueous humour outflow pathways [5,6,17,18]. To date, almost 270 mutations have been reported in the CYP1B1 gene, including missense mutations, small deletions, indels, gross deletions and regulatory mutations [19]. Studies have revealed several genetic mutations causing PCG in the Pakistani population, but these data are still limited compared with the high prevalence of this disorder in our population due to consanguinity [6,10,15,20]. In an ongoing effort of mutation screening of the CYP1B1 gene in PCG cases belonging to consanguineous Pakistani families, we enrolled and screened twenty-five families for CYP1B1 variants. Each family had at least one child affected with primary congenital glaucoma.
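As a quick consistency check on the domain lengths quoted above, the three segments should add up to the full-length protein. The snippet below is only an arithmetic sketch; the dictionary keys are descriptive labels chosen here, not standard domain names.

```python
# Segment lengths (in amino acids) as quoted in the text.
domains = {
    "transmembrane domain (amino terminal)": 53,
    "proline-rich hinge": 10,
    "cytoplasmic region": 480,
}

# The segments should account for the whole 543-residue protein.
total_length = sum(domains.values())
print(total_length)
```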

Assessment and enrollment of patients
Based on clinical assessment provided by ophthalmologists, 25 families diagnosed with PCG were enrolled from Khyber Pakhtunkhwa (KPK), Baluchistan and South Punjab. Clinical data, family history and blood samples of patients and each available family member were collected after informed written consent, following the principles of the Declaration of Helsinki of the World Medical Association [21]. The study was approved by the Bioethical Review Committee (BEC) of Quaid-i-Azam University (QAU), Islamabad, Pakistan. The inclusion criterion for patients was a diagnosis of PCG by an ophthalmologist based on symptoms such as buphthalmos, edema and corneal cloudiness. Patients with other eye diseases and PCG patients who were not born to consanguineous couples were excluded from the study. Each family was given a unique identification number (PCG047-PCG069 and PCG101, PCG102). Pedigrees of the affected families were drawn using the HaploPainter program (http://haplopainter.sourceforge.net/about.html) [22].

Extraction of genomic DNA
On average, 4 ml of peripheral blood was taken from each participating individual and stored in a 5 ml EDTA (ethylenediaminetetraacetic acid) vacutainer. Extraction was performed using the non-organic method of DNA extraction described by Kaul et al., 2010 [23]. A Nanodrop spectrophotometer (Thermo Scientific) was used to check the purity of the DNA and to quantify it.

Amplification and sequencing of CYP1B1 gene
Amplification of the coding regions and at least 50 base pairs of flanking non-coding regions was performed using primers reported previously by Afzal et al., 2019 [20]. A 25 μl polymerase chain reaction (PCR) was performed for both affected and non-affected individuals following the protocol described by Afzal et al., 2019 [20]. After amplification, samples and controls were loaded on a 1.5% agarose gel along with a DNA ladder (1 kb) to separate bands according to their sizes. Purification was done following the instructions provided by the manufacturer of the PCR purification kit (Wiz Bio Solutions, Seongnam, Korea). Finally, each amplified product was sequenced using BigDye Terminator ready reaction mix (Applied Biosystems, Foster City, CA, USA) in an automated ABI 3100 genetic analyzer. Sequencing results were analyzed by aligning them to the reference sequence NM_000104.4 using Sequencher software (5.4.6) and the CodonCode Aligner program to identify sequence variations. After identification of each variant, its disease-causing potential was checked using MutationTaster (https://www.mutationtaster.org/) [24]. Furthermore, segregation with the disease phenotype was confirmed by sequencing other available family members. Each novel sequence variant was checked in 96 control samples.
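The variant classes used in this report (frameshift, missense, silent) can be read from the HGVS protein-level description of a variant. The following is a minimal sketch of that reading, not the analysis pipeline used in this study: the function name `classify_protein_change` and the regular expression are illustrative assumptions, and only simple substitutions and frameshifts are handled.

```python
import re

# Simple substitution such as p.Arg153Gly; "=" marks an unchanged residue.
SUBSTITUTION = re.compile(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|=)")

def classify_protein_change(hgvs_p):
    """Return a rough consequence class for an HGVS protein description."""
    if "fs" in hgvs_p:
        return "frameshift"
    match = SUBSTITUTION.fullmatch(hgvs_p)
    if match is None:
        return "other"
    ref, _position, alt = match.groups()
    # Same residue (or "=") means no amino acid change.
    return "silent" if alt in ("=", ref) else "missense"

# Protein descriptions taken from the variants listed in this report.
examples = ["p.Gly211Argfs*13", "p.Arg153Gly", "p.Met244Ile", "p.Asp83Thrfs*12"]
calls = {name: classify_protein_change(name) for name in examples}
print(calls)
```

A classification of this kind says nothing about pathogenicity; it only sorts variants by the type of protein change before prediction tools are applied.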
The average age of the probands of the enrolled families was 10 ± 6 years. Ophthalmological findings confirmed the diagnosis of congenital glaucoma for each proband. The proband of each enrolled family was born to consanguineous parents. Molecular screening of the CYP1B1 coding regions and at least 50 base pairs of flanking non-coding region using DNA of each proband revealed thirteen novel disease-causing variations in the coding regions according to MutationTaster (Table 1, Figs 2 and 3).

CYP1B1 disease causing variations segregating in families with PCG
In family PCG049, two disease-causing variations were detected. A single nucleotide substitution, c.457C>G, was present in the heterozygous condition; it changed arginine at position 153 to glycine and was deleterious according to PROVEAN and PolyPhen-2, with scores of -5.21 and 1.00 respectively (Fig 3A) (Table 3). The other heterozygous missense variant, c.516C>A, resulted in substitution of serine at position 172 by arginine (p.Ser172Arg) (Fig 3B). Human Splicing Finder (HSF) predicted that this substitution at position 172 will create new sites for auxiliary factors such as exonic splicing enhancer (ESE) 9G8, exonic splicing suppressor (ESS) hnRNPA1, IIE, Fas ESS and Sironi_motif2, and break sites for EIE, ESE_SRp55 and Sironi_motif1, which might affect the protein structure. In PCG050, insertion of thymine (T) in exon 2 at position 629-630 (Fig 2A) changed the amino acid glycine to arginine, resulting in a frameshift and an in-frame stop codon leading to a truncated protein after 13 residues (Table 1). This homozygous variant was predicted to be damaging to the protein structure according to in-silico analysis (PROVEAN -6.66 and PolyPhen-2 1.00) (Table 3). Varsome predicted it as pathogenic, and negative values determined by I-Mutant and MUpro indicated a destabilizing effect. HSF analysis for c.629dup predicted alteration of auxiliary sequences: the SRp55/SRSF6 (Serine and Arginine Rich Splicing Factor 6) ESE site TTCGGC and the Fas ESS site TGTTTC were broken. Two new ESS sites, TTTTCG and GTTTTC, were created for IIE and one, TGTGTTTT, for PESS (putative exonic splicing silencer). In family PCG052, two homozygous variants, c.722T>A (Fig 3C) and c.732G>A, were found in the second exon of the CYP1B1 gene (Table 1). Variation c.732G>A was previously reported and was described as less deleterious than the other variation by the pathogenicity prediction tools.
I-Mutant and MUpro gave negative values for both variations, which indicates an unstable protein structure (ΔΔG (kcal/mol) -0.62, -0.11 and -1.5608, -0.5974 respectively) (Table 3). In family PCG053, duplication of a single nucleotide, c.287dup (Fig 2B), resulted in a frameshift and an in-frame stop codon at position 127 of the new reading frame, i.e., p.Leu97Alafs*127. This homozygous variant had a deleterious effect on protein structure (PROVEAN score -4.09) and was predicted as pathogenic by Varsome (Table 3). This variation was predicted to create the ESE/ESS sites GGTTGCTG, GTGGTT and TAGTGGTT for the factors ESE_SC35, Fas ESS and PESS respectively. Thermodynamic stability tools predicted this variation as destabilizing, but HOPE indicated that it might not cause disease because in rare cases this mutant residue was observed in homologous proteins. Two disease-causing variants in the homozygous condition were present in the proband of family PCG054, one of which was already reported, i.e., c.868dup, which shifted the reading frame and truncated the protein after 37 residues, i.e., p.Arg290Profs*37 (Table 1). The other variant was a novel frameshift variation, c.662dup (p.Arg222Profs*2) (Fig 2C). A decrease in the stability of the protein structure was predicted by I-Mutant and MUpro, which gave negative energy values of -0.51 and -0.852 respectively.
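The stability predictions above are interpreted by sign: a negative predicted ΔΔG is read as a decrease in protein stability. A minimal sketch of that reading is given below. The function name `stability_call` is a hypothetical helper, the sign convention is the one stated in the text, and the values are the PCG052 figures quoted above, listed in the order given there.

```python
def stability_call(ddg):
    """Interpret a predicted ddG (kcal/mol) by its sign, as done in the text."""
    if ddg < 0:
        return "destabilizing"
    if ddg > 0:
        return "stabilizing"
    return "neutral"

# Predicted values for the two PCG052 variants, in the order quoted in the text.
predicted_ddg = {
    "I-Mutant": [-0.62, -0.11],
    "MUpro": [-1.5608, -0.5974],
}

calls = {tool: [stability_call(v) for v in values]
         for tool, values in predicted_ddg.items()}
print(calls)
```

The sign gives only the predicted direction of the effect; the magnitude, and agreement between tools, would need separate consideration.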
A reported homozygous frameshift variation, c.247del (p.Asp83Thrfs*12), was found in family PCG058; it replaced aspartic acid at position 83 with threonine and shifted the reading frame (Table 3). Missense variations present in family PCG060, i.e., c.740T>A (p.Leu247Gln), and PCG067, i.e., c.724G>C (p.Asp242His) (Fig 3D and 3F), were predicted to be highly deleterious (PROVEAN scores: -5.45 and -6.50) by pathogenicity prediction tools. The change of aspartate at position 242 to histidine resulted in the creation and destruction of many sites for auxiliary factors that help in splicing. According to HSF, sites for ESE_9G8, EIE, Sironi_motif2, PESE and Sironi_motif1 were broken and a new site, CACGTG, was created for ESE_SRp55.
Family PCG062 had two different heterozygous variants in the coding regions, c.1263T>A (Fig 3E) and c.1314G>A (Table 1), showing compound heterozygosity. One of the detected variants, c.1263T>A, changed phenylalanine at position 421 to leucine and was reported as probably damaging by PolyPhen-2. Human Splicing Finder predicted that a new acceptor site will be created (CTGTGGTTTTTGTC>CTGTGGTTTTAGTC), changing the consensus value (CV) from 50.91 to 78.78. The CV for the newly created site showed that it is not a very strong site (strong site CV > 80). PCG063 had two unreported mutations, a silent heterozygous mutation, c.771T>G, and a homozygous frameshift mutation, c.789dup (Fig 2D). The frameshift replaced leucine at position 264 with alanine and terminated the new reading frame at position 63 (p.Leu264Alafs*63), resulting in a short protein (Table 3). HSF analysis showed that the site CAGGCT was created for ESS_hnRNPA1, CAGGCTCA for PESE and GCAGGC for ESE_9G8, and that the sites AGCAGC and AGCAGCTC, which are required for the auxiliary factors ESE_SRp55 and PESE respectively, were broken.
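The acceptor-site prediction for c.1263T>A can be summarized by two simple operations: locating the base that differs between the wild-type and mutant motifs, and comparing the new consensus value with the strong-site threshold. The sketch below uses only the sequences and values quoted above; the function name `describe_site` is a hypothetical helper and the code does not reproduce how HSF computes consensus values.

```python
STRONG_SITE_CV = 80.0  # threshold for a strong site, as quoted in the text

def describe_site(cv_before, cv_after, strong=STRONG_SITE_CV):
    """Return the change in consensus value and a strength label for the new site."""
    change = round(cv_after - cv_before, 2)
    strength = "strong" if cv_after > strong else "not strong"
    return change, strength

wild_type = "CTGTGGTTTTTGTC"
mutant = "CTGTGGTTTTAGTC"

# Zero-based positions at which the two motifs differ.
differences = [(i, w, m) for i, (w, m) in enumerate(zip(wild_type, mutant)) if w != m]

change, strength = describe_site(50.91, 78.78)
print(differences, change, strength)
```

The single T>A difference introduces an AG dinucleotide in the mutant motif, which is consistent with the prediction of a new, though not strong, acceptor site.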

Discussion
The high frequency, i.e., 70-100%, of consanguineous marriages [34] is the main cause of the high prevalence of autosomal recessive disorders like PCG in Pakistan [6]. Mutated CYP1B1-encoded protein is reported to cause abnormal development of ocular structures, resulting in impeded outflow of aqueous humor and the PCG phenotype [35,36]. Previous studies have shown that the mutational spectrum of the CYP1B1 gene varies among different populations, i.e., p.Ser476Pro is 44% prevalent in India, p.Arg469Trp, p.Arg368His, p.Arg390His, p.Gly61Glu and p.Glu173Arg are 70% prevalent in Iran, p.Gly61Glu, p.Arg390His and p.Glu229Lys are  [15]. Previous studies from Pakistan reported the p.Arg390His mutation to be implicated in more than 50% of analyzed PCG cases [6,20,38,39]; however, in the present study, we did not identify this mutation in any of the analyzed cases. A possible explanation for the non-detection of the p.Arg390His mutation in our study cohort could be differences in the ethnicities of the analyzed subjects. In previous studies, PCG cases belonging to the Punjab and Sindh provinces of Pakistan were included [6,38]; however, in the present study the majority of the families, i.e., 13/25, belonged to the Baluchistan province of Pakistan.
In the present study, CYP1B1 analysis in 25 cases enrolled from various regions of Punjab, Baluchistan and Khyber Pakhtunkhwa, Pakistan, revealed a total of seven frameshift, seven missense and two silent disease-causing variations. Among the seven frameshift variations, five are novel, whereas two were previously reported in patients of different ethnicities. The variant c.868dup (p.Arg290Profs*37) (rs67543922) was initially identified in a PCG-affected Pakistani family by Sheikh et al., 2014 [40] and then in another family by Micheal et al., 2015 [39]. Mutational analysis of CYP1B1 conducted on the population of the Sindh and Punjab provinces of Pakistan by Rashid et al., 2019 [38] reported that, out of a total of 427 individuals, c.868dup was found in two families, in which it resulted in a premature stop codon and eventually truncation.
The second reported frameshift variant, c.247del (p.Asp83Thrfs*12), was initially identified in a study conducted on the Indian population by Tanwar et al., 2009 [41]. According to that study, a stop codon TAG was introduced at position 94 due to a frameshift after codon 82 [41]. Five homozygous frameshift variants including c.629dup, c.287dup, c.662dup, c.4_5insT, c.758_759insA and c.789dup detected in our study have not been reported earlier from Pakistan or any other region. All homozygous variants identified in this study showed perfect segregation with the disease phenotype in all families (Fig 2). Previously, Ou et al., 2018 [42] showed that the active site residues of CYP1B1 are distributed from amino acid 126 to 510 of the protein; therefore, all truncations that omit one or more of these amino acids result in loss of protein function [43].
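The relation between a frameshift description and the extent of the truncated protein can be made explicit. Under the HGVS convention, the number after `fs*` counts codons starting from the first changed amino acid, so the stop codon falls at position (first changed residue + N - 1); for p.Asp83Thrfs*12 this gives 94, in agreement with the position reported by Tanwar et al. The sketch below applies this to the frameshift variants listed in this report and compares the result with the end of the active-site range (residue 510) quoted above. The function name `stop_codon_position` is a hypothetical helper, and the comparison is a simplification that ignores possible nonsense-mediated decay.

```python
import re

ACTIVE_SITE_END = 510  # last active-site residue, as quoted in the text

FRAMESHIFT = re.compile(r"p\.[A-Z][a-z]{2}(\d+)[A-Z][a-z]{2}fs\*(\d+)")

def stop_codon_position(hgvs_p):
    """Position of the new stop codon implied by an HGVS frameshift description."""
    match = FRAMESHIFT.fullmatch(hgvs_p)
    if match is None:
        raise ValueError(f"not a frameshift description: {hgvs_p}")
    first_changed, offset = int(match.group(1)), int(match.group(2))
    # The first changed residue counts as position 1 of the new reading frame.
    return first_changed + offset - 1

frameshifts = [
    "p.Gly211Argfs*13", "p.Leu97Alafs*127", "p.Arg222Profs*2",
    "p.Val254Glyfs*73", "p.Leu264Alafs*63", "p.Arg290Profs*37",
    "p.Asp83Thrfs*12",
]

stops = {name: stop_codon_position(name) for name in frameshifts}
loses_active_site = {name: pos <= ACTIVE_SITE_END for name, pos in stops.items()}
print(stops)
```

By this calculation every listed frameshift terminates well before residue 510, so each predicted product would lack part of the active-site region.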
The missense disease-causing variants found in families PCG049, 052, 060, 062 and 067 and the two silent disease-causing variants found in families PCG062 and 063 are also previously unreported (Fig 3, Table 3). In the CYP1B1 protein, the novel missense variant p.Arg153Gly is located in the C-helix, p.Ser172Arg in the D-helix, p.Phe421Leu in the K-helix, and p.Val241Glu, p.Asp242His and p.Leu247Gln in substrate recognition site 2 [42]. The locations of these residue replacements in conserved core structures highlight their possible severe effect on the structure and functionality of the mutated protein, hence causing the disease phenotype [43,44]. In two families, PCG049 and PCG062, we identified compound heterozygous mutations in the CYP1B1 gene. Compound heterozygosity has previously been reported in developmental glaucoma [45] and in primary congenital glaucoma patients from China [46]. Cai et al., 2021 reported that two heterozygous mutations, c.1310C>T (p.P437L) and c.3G>A (p.M1I), are responsible for glaucoma in a Chinese family [45]. In another study conducted on 13 Chinese PCG patients, two heterozygous mutations, Ala330Phe and Arg390His, were detected in one patient, and reduced enzymatic activity due to these variants was reported to be the cause of disease [46]. Waryah et al., 2019 identified compound heterozygosity (p.Val364Met along with p.Pro350Thr) in two consanguineous PCG families belonging to different ethnic groups of Pakistan [47]. Furthermore, previous studies have also reported co-segregation of heterozygous variants of CYP1B1 with heterozygous TEK alleles in PCG cases [48]. In the present study, we identified a heterozygous variant, p.Leu247Gln, in a consanguineous family, PCG060. The absence of any other heterozygous or homozygous variant in CYP1B1, the recessive inheritance pattern and the previously reported allelic interactions of two unlinked genes for the PCG phenotype [12,13,48] necessitate genetic analysis of other glaucoma-related genes, including MYOC, FOXC1 and TEK, in family PCG060.
Due to epigenetic modifications and different environmental factors, incomplete penetrance and increased variability may be observed in the manifestation of CYP1B1 disease-causing variations in PCG patients [38,49]. We could not identify homozygous or compound heterozygous disease-causing variants in fifteen of the families analyzed in this study, which suggests a contribution of other genes such as LTBP2, TEK, MYOC and FOXC1, and a regulatory effect of cis-acting elements, splicing elements or possible modifiers [12,13,40,50].

Conclusion
In conclusion, we identified thirteen previously unreported and three reported mutations, as well as six SNPs (one novel), in PCG probands born to consanguineous parents, highlighting the autosomal recessive pattern of the disease. Proper genetic testing and counseling should be provided to people in areas with high consanguinity to help ophthalmologists in disease management and treatment. Mass screening and additional studies are required to better understand the heterogeneous pattern and the contribution of the CYP1B1 gene to PCG pathophysiology in our population.